Abstract

In this study, we investigated the evolution of vertebrate tissues by examining the potential association
among gene expression, duplication, and base substitution patterns. In particular, we compared whole-genome
duplication (WGD) with small-scale duplication (SSD), as well as tissue-restricted with ubiquitously
expressed genes. All patterns were also analysed in the light of gene evolutionary rates. Among genes
characterized by rapid evolution and expression in a restricted range of tissues, SSD genes were represented
in a larger proportion than WGD genes. Conversely, genes with ubiquitous expression were associated
with slower evolutionary rates and a larger proportion of WGD genes. The results also show that evolutionary
rates were faster in genes expressed in endodermal tissues and slower in genes expressed in ectodermal tissues.
Accordingly, the proportions of SSD and WGD genes were highest in the endoderm and ectoderm, respectively.
Therefore, quickly evolving SSD genes might have contributed to the faster evolution of endodermal
tissues, whereas the comparatively slowly evolving WGD genes might have functioned to maintain
the basic characteristics of ectodermal tissues. Mesenchymal tissues occupied an intermediate position in
this regard, whereas the patterns observed for haemocytes were unique. Rapid tissue evolution could be
related to a specific gene duplication mode (SSD) and faster molecular evolution in response to exposure
to the external environment. These findings reveal general patterns underlying the evolution of tissues
and their corresponding genes.
1. Introduction

Mutations are the main factors driving genome evolution and may occur within genes through base
substitutions or involve the duplication of entire genes. In the latter case, two mechanisms have been
recognized: whole-genome duplication (WGD) and small-scale duplication (SSD), which affects a relatively
small region of the genome. For instance, the early vertebrate ancestor is thought to have undergone
two rounds of WGD, as suggested by the four vertebrate Hox gene clusters located on different
chromosomes. Paralogous genes that originated from WGD are referred to as ohnologues. Singletons, in
turn, are genes that did not undergo either WGD or SSD.

In general, duplicated genes are redundant and their functions overlap. Thus, in simple organisms
such as yeasts and nematodes, the proportion of essential genes among duplicated genes is about half
that among singletons. In mice, however, the proportion of essential genes is comparable between
duplicated genes and singletons. Furthermore, ohnologues are likely to contain a larger proportion of
essential genes than SSD genes. Ohnologues are indeed associated with development, the regulation of
transcription, and protein complex formation. For these genes to function properly, the relative amounts
of their products must be balanced. WGD typically involves the simultaneous duplication of all genes,
thereby preserving the relative dosage of each gene. Because either the loss or the gain of ohnologues
may lead to dosage imbalance, ohnologues are expected to be retained intact in the genome.
Similarly, ancient WGD-derived ohnologues are expected to be more conservative than recently
evolved SSD genes.

Gene expression can be tissue-specific, determining phenotypes such as the morphology and function of
tissues, or ubiquitous. Previous research indicates that ubiquitously expressed genes are likely to evolve
slowly. For example, in humans and mice, orthologous genes that are expressed in a limited
number of tissues tend to evolve faster than ubiquitously expressed genes. Little is known, however,
about the extent to which the evolution of tissues is influenced by the differential modes of gene
duplication and expression. Indeed, there are no reports exploring the effects of gene duplication events,
such as WGD and SSD, on tissue-restricted or ubiquitous gene expression.

In this study, the potential associations between 
gene evolutionary rates, duplication (WGD, SSD, and 
singletons), and gene expression breadth in different 
tissues (restricted or ubiquitous) were investigated. 
In addition, these parameters were also analysed in 
relation to the developmental origin of tissues (endo- 
dermal, mesenchymal, or ectodermal). The results 
support the notion that both base substitutions 
within genes and gene duplication are associated 
with gene expression breadth and that the nature of 
duplication (WGD or SSD) differs substantially de- 
pending on the germ-layer origin of the tissue. 
Tissue evolution is therefore discussed here as the 
outcome of a process involving the gene evolutionary 
rate, duplication, and expression. 

2. Materials and Methods 

2.1. Classification of human genes based on the gene duplication mode
Protein-coding genes of human origin were obtained from Ensembl release 52
(http://www.ensembl.org). A total of 7294 ohnologues and 9027 SSD genes were defined
as described in Makino and McLysaght. Briefly, genes were judged to be
duplicated when the two aligned sequences
showed homology in their >30% length with e< 
10~^ in BLAST search. Ohnologues were syntenic 
genes located on paralogous chromosomal regions and derived from WGD, whereas SSD genes were
duplicated genes that did not arise through WGD. Of the 9027 SSD genes, 1478 were classified as both
ohnologues and SSD genes and were therefore excluded from the analysis, resulting in 7549 pure SSD
genes. An additional 6064 genes were classified as singletons. Thus, 20 907 genes
(7294 + 7549 + 6064 = 20 907) were considered.
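
The bookkeeping behind these counts can be sketched with plain set operations. This is only an illustration under stated assumptions: the function name `partition_genes` and the toy gene identifiers are hypothetical, and the sketch assumes that genes flagged as both ohnologue and SSD are removed from the SSD set only (the text does not say how they are treated on the ohnologue side).

```python
from typing import Dict, Set


def partition_genes(all_genes: Set[str],
                    ohnologues: Set[str],
                    ssd_hits: Set[str]) -> Dict[str, Set[str]]:
    """Split genes into ohnologues, pure SSD genes, and singletons.

    Assumption: genes found in both input sets are dropped from the SSD set.
    """
    both = ohnologues & ssd_hits                    # flagged as ohnologue and SSD
    pure_ssd = ssd_hits - both                      # SSD genes kept for analysis
    singletons = all_genes - ohnologues - ssd_hits  # no duplicate detected
    return {"ohnologue": set(ohnologues), "pure_ssd": pure_ssd,
            "singleton": singletons, "both": both}


# Toy example with hypothetical gene identifiers.
toy = partition_genes({"g1", "g2", "g3", "g4", "g5", "g6"},
                      ohnologues={"g1", "g2"},
                      ssd_hits={"g2", "g3", "g4"})

# Counts quoted in the text.
PURE_SSD = 9027 - 1478           # SSD genes after removing the overlap
TOTAL = 7294 + PURE_SSD + 6064   # ohnologues + pure SSD genes + singletons
```

With the numbers quoted above, `PURE_SSD` evaluates to 7549 and `TOTAL` to 20 907, which matches the totals given in the text.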

To define the origin of SSDs, a sequence similarity search was performed within protein-coding human
genes using the all-against-all BLASTP program. Synonymous substitution rates (Ks) were estimated
for each close paralogue. There were 2510 and 5039 SSD genes for which Ks was <1 (recent SSD) and
>1 (old SSD), respectively (2510 + 5039 = 7549).
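
As a minimal sketch of this split, assuming a mapping from each SSD gene to the Ks value against its close paralogue is already available, the two classes can be separated with a threshold of 1. The function name and the toy values are hypothetical; how a Ks of exactly 1 is handled is not stated in the text, so such genes are left unassigned here.

```python
from typing import Dict, Set, Tuple


def split_ssd_by_age(ks_by_gene: Dict[str, float],
                     threshold: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Return (recent, old) SSD gene sets based on Ks to the close paralogue."""
    recent = {gene for gene, ks in ks_by_gene.items() if ks < threshold}
    old = {gene for gene, ks in ks_by_gene.items() if ks > threshold}
    return recent, old


# Toy Ks values (hypothetical, for illustration only).
recent, old = split_ssd_by_age({"ssdA": 0.2, "ssdB": 0.9, "ssdC": 1.8, "ssdD": 3.1})
```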

2.2. Human orthologous genes and gene 
evolutionary rates 

Human genes with orthologues in mice and other species were obtained from Ensembl release 52.
Orthologous sequences were aligned using CLUSTALW, and Ks and non-synonymous substitution
rates (Ka) were deduced for each orthologous pair using the method of Yang and Nielsen
implemented in PAML. Next, ω values (Ka/Ks) were calculated.
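
The last step is a simple ratio. The sketch below assumes Ka and Ks have already been estimated for an orthologous pair (for example, by PAML) and only shows the division; returning `None` when Ks is not positive is a choice made for this illustration, not something described in the text.

```python
from typing import Optional


def omega(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero or negative (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


# Hypothetical estimates for one human-mouse orthologous pair.
example_omega = omega(ka=0.05, ks=0.5)
undefined_omega = omega(ka=0.05, ks=0.0)
```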

2.3. Human EST data 

An expressed sequence tag (EST) database was used to determine the expression profile of human
genes in various tissues. The data were registered at NCBI and included 3 199 559 reads from 47 different
human tissues. Each tissue contained more than 10 000 ESTs (68 076 on average). Using these
47 tissues, the breadth of expression of a gene was represented by the number of tissues in which the
gene was identified, based on the detection of ESTs. Thus, breadth varied from 1 to 47. Among
the 20 907 human genes considered for analysis, there were 3871 for which no ESTs were identified
in the EST database. These genes were excluded from further analyses, resulting in a total
of 17 036 genes, among which 6952 were ohnologues, 5505 were SSDs, and 4579 were
singletons.
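
Expression breadth as defined here is a count of tissues with at least one detected EST. A minimal sketch, assuming EST hits are stored as a nested mapping from gene to tissue to read count (the data layout, function name, and toy values are hypothetical):

```python
from typing import Dict


def expression_breadth(est_counts: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Number of tissues with at least one EST per gene; undetected genes are dropped."""
    breadth = {}
    for gene, per_tissue in est_counts.items():
        n_tissues = sum(1 for count in per_tissue.values() if count > 0)
        if n_tissues > 0:  # genes without any EST are excluded, as in the text
            breadth[gene] = n_tissues
    return breadth


# Toy EST counts (hypothetical).
toy_breadth = expression_breadth({
    "geneA": {"liver": 3, "lung": 0, "skin": 1},
    "geneB": {"liver": 0, "lung": 0, "skin": 0},
    "geneC": {"liver": 2, "lung": 5, "skin": 7},
})

GENES_WITH_EST = 20907 - 3871  # genes retained after removing those without ESTs
```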

2.4. Classification of human tissues based 
on the developmental origin 

Forty-three tissues were classified into four subgroups based on their developmental origin, as
follows: endoderm (nasopharynx, thymus, stomach, colon, bladder, liver, trachea, lung, pancreas, uterus,
cervix, prostate, and intestine), ectoderm (breast, skin, caudate nucleus, hypothalamus, eye, thalamus,
subthalamic nucleus, cerebellum, hippocampus, corpus callosum, nervous system, amygdala, and
substantia nigra), mesenchymal (synovium, kidney, adipose tissue, bone, adrenal gland, cartilage,
muscle, pericardium, and heart), and haemocytes (B cells, bone marrow, germinal centre B cells, blood
cells, lymph node, spleen, blood vessels, and T lymphocytes). The remaining four tissues
(ovary, testis, amnion, and placenta) were not assigned to any of the above four subgroups.

One caveat of this classification is that a tissue is assigned according to one of its components.
For example, the intestine is classified as an endodermal tissue based on the presence of
endoderm-derived intestinal epithelium. As a macroscopic organ, however, the intestine includes not
only epithelium but also mesoderm-derived tissues such as the submucosa and muscles. Thus, in the
above classification, an endodermal tissue usually includes both endoderm- and mesoderm-derived
tissues, whereas an ectodermal tissue includes both ectoderm- and mesoderm-derived tissues.

2.5. Definition of tissue evolutionary rates 

The tissue evolutionary rate was originally calculated by Kuma et al., and this method has been
adopted in succeeding studies. We therefore employed their definition in the present
study. We assumed that a set of genes is expressed in a given tissue type and that the ω values of the
expressed genes are $X_1, X_2, \ldots, X_m$ (for ohnologues), $Y_1, Y_2, \ldots, Y_n$ (for SSD genes), and
$Z_1, Z_2, \ldots, Z_o$ (for singletons). The evolutionary rate of this tissue is then given by

\[
\frac{(X_1 + X_2 + \cdots + X_m) + (Y_1 + Y_2 + \cdots + Y_n) + (Z_1 + Z_2 + \cdots + Z_o)}{m + n + o},
\]

where $m$, $n$, and $o$ are the respective numbers of ohnologues, SSD genes,
and singletons.
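
In other words, the tissue evolutionary rate is the plain mean of ω over all genes expressed in the tissue, regardless of gene type. A minimal sketch of this definition follows; the function name and the toy ω values are hypothetical.

```python
from typing import Dict, List


def tissue_evolutionary_rate(omega_by_type: Dict[str, List[float]]) -> float:
    """Mean omega over all genes expressed in one tissue (all gene types pooled)."""
    pooled = [w for values in omega_by_type.values() for w in values]
    if not pooled:
        raise ValueError("no expressed genes with an omega value")
    return sum(pooled) / len(pooled)


# Toy omega values for one tissue (hypothetical).
toy_tissue = {
    "ohnologue": [0.10, 0.20],  # X_1 .. X_m
    "ssd": [0.30],              # Y_1 .. Y_n
    "singleton": [0.40],        # Z_1 .. Z_o
}
toy_rate = tissue_evolutionary_rate(toy_tissue)
```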

3. Results 

3.1. Association between expression breadth and duplication mode
Figure 1 shows the gene number distribution over various expression breadths (the actual
number of genes expressed in each tissue at each expression breadth is shown in Supplementary Table
S1). Figure 1A-C displays the numbers of ohnologues, SSD genes, and singletons, respectively. The average
number of tissues in which each gene type was expressed was 18.7 for ohnologues (blue), 16.6 for
SSD genes (red), and 18.0 for singletons (green). Figure 1D shows that the proportions of each gene
type at each expression breadth were noticeably different. Specifically, the proportion of SSD genes was
relatively higher among breadth-restricted genes (see n = 1, n < 3, n < 5, and n < 10), whereas the
proportion of ohnologues was increasingly higher among genes expressed in a larger number of tissues
(see n > 10, n > 20, and n > 40).

3.2. Association between gene evolutionary rates 
and duplication mode 

Ohnologues are believed to be more conservative than SSD genes with respect to functional essentiality
and dosage-balance requirement. To determine whether ohnologues are also conservative in terms
of molecular evolution, non-synonymous nucleotide divergence in the coding region (between humans
and mice) was examined. Table 1 shows that the average ω value of ohnologues (0.11) was 0.55- to
0.57-fold that of SSD genes (0.19; P < 2.2 × 10^-16, Mann-Whitney U-test) and of singletons
(0.20; P < 2.2 × 10^-16). The result confirms the conservative nature of ohnologues in the
evolution of coding regions.
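
For readers unfamiliar with the test, the U statistic behind such a comparison can be sketched in a few lines. This is an illustrative, dependency-free version that computes only the two U statistics (with average ranks for ties) and no P-value; it is not the software used in the study, and the toy ω values are hypothetical.

```python
from typing import List, Tuple


def mann_whitney_u(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (U_x, U_y) for two samples, using average ranks for tied values."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Toy omega values (hypothetical): ohnologues versus SSD genes.
u_ohno, u_ssd = mann_whitney_u([0.05, 0.08, 0.11], [0.15, 0.19, 0.30])
```

A U statistic of zero for one sample means that every value in it is smaller than every value in the other sample.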

3.3. Tissue evolutionary rates in restricted 
expression breadths 

In Fig. 2A, tissue evolutionary rates are plotted in increasing order for each of the 47 tissues under
the condition n < 10 (see 'Proportions of ohnologues and SSD genes in various expression breadths'
for the rationale behind the use of n < 10). The ω values gradually increased from 0.13 in the most
slowly evolving tissue (leftmost) to 0.29 in the fastest evolving tissue (rightmost).

3.4. Gene evolutionary rate in restricted 
expression breadths 

The average ω values were calculated separately for each type of gene as follows:
$(X_1 + X_2 + \cdots + X_m)/m$ for ohnologues, $(Y_1 + Y_2 + \cdots + Y_n)/n$ for SSD genes, and
$(Z_1 + Z_2 + \cdots + Z_o)/o$ for singletons. Figure 2B shows that the evolutionary rate (ω) of
ohnologues (blue), SSD genes (red), and singletons (green) increased gradually in line with increases
in tissue evolutionary rates, indicating that all gene types evolved in parallel with tissue evolution.
Therefore, irrespective of the gene type, gene evolutionary rates were higher for genes expressed
in fast evolving tissues and lower for genes expressed in slowly evolving tissues. It is worth noting,
however, that the average ω value of ohnologues was low (0.13), whereas that of SSD genes and
singletons was high (0.26 and 0.29, respectively; Fig. 2B and see n < 10 in Table 1).

Figure 1. The number and proportion of ohnologues, SSD genes, and singletons at each tissue expression breadth (as determined by the
number of tissues where a corresponding EST was detected for a respective gene). The number of genes at each expression breadth was
counted for ohnologues (A, blue), SSD genes (B, red), and singletons (C, green). (D) The percentage of ohnologues, SSD genes, and
singletons at each breadth.



Table 1. Average Ka/Ks values

Category                            Subcategory                 Number of compared        Average Ka/Ks   SD
                                                                orthologous gene pairs
Expressed genes                     Ohnologues                  6952                      0.11            0.11
                                    Total SSD genes             5505                      0.19            0.16
                                    Old SSD genes (Ks > 1)      4356                      0.17            0.14
                                    Recent SSD genes (Ks < 1)   1149                      0.29            0.20
                                    Singletons                  4579                      0.20            0.22
Narrowly expressed genes (n < 10)   Ohnologues                  2151                      0.13            0.12
                                    Total SSD genes             2008                      0.26            0.18
                                    Old SSD genes (Ks > 1)      1338                      0.22            0.16
                                    Recent SSD genes (Ks < 1)   670                       0.34            0.20
                                    Singletons                  1101                      0.29            0.19
Broadly expressed genes (n > 10)    Ohnologues                  4801                      0.10            0.11
                                    Total SSD genes             3497                      0.15            0.14
                                    Old SSD genes (Ks > 1)      3018                      0.14            0.12
                                    Recent SSD genes (Ks < 1)   479                       0.23            0.19
                                    Singletons                  3478                      0.17            0.23

No. 4] 

3.5. Proportions of gene type in restricted 
expression breadths 

Figure 2C shows that an increase in tissue evolutionary rates was accompanied by a corresponding
decrease in the proportion of ohnologues and an increase in the proportion of SSD genes. This result
indicates that ohnologues and SSD genes tended to be expressed in slowly and fast evolving tissues,
respectively. The proportion of singletons also appeared to increase in parallel with tissue
evolutionary rates.

Figure 2B and C also indicate that ohnologues and SSD genes behave differently. In the case of SSD
genes, both the ω values and the proportions are positively associated with tissue evolutionary rates,
suggesting that SSD genes contribute to faster tissue evolution. In the case of ohnologues, however,
the ω values increased with the tissue evolutionary rate while the proportion decreased. Therefore,
further analysis is needed to examine the relative contribution of ohnologues to tissue evolution.



3.6. Differential contribution of ohnologues, SSD genes, and singletons among genes with
restricted expression
Both the evolutionary rate of each gene type and their proportion are incorporated in the following
formula:

\[
\frac{(X_1 + X_2 + \cdots + X_m) + (Y_1 + Y_2 + \cdots + Y_n) + (Z_1 + Z_2 + \cdots + Z_o)}{m + n + o}
= \frac{X_1 + X_2 + \cdots + X_m}{m} \times \frac{m}{m + n + o}
+ \frac{Y_1 + Y_2 + \cdots + Y_n}{n} \times \frac{n}{m + n + o}
+ \frac{Z_1 + Z_2 + \cdots + Z_o}{o} \times \frac{o}{m + n + o},
\]

where $[(X_1 + X_2 + \cdots + X_m)/m] \times [m/(m + n + o)]$ represents ohnologues,
$[(Y_1 + Y_2 + \cdots + Y_n)/n] \times [n/(m + n + o)]$ represents SSD genes, and
$[(Z_1 + Z_2 + \cdots + Z_o)/o] \times [o/(m + n + o)]$ represents singletons.
Figure 2D shows the ohnologue, SSD, and singleton components in relation to tissue evolutionary rates.
For SSD genes and singletons, there was a positive association between tissue and gene evolutionary
rates. In contrast, the ohnologue component was almost flat irrespective of the tissue evolutionary
rate. This flat line suggests that ohnologues did not play a major role in tissue evolution, in
agreement with the notion that ohnologues are conservative in nature. However, this does not
necessarily mean that ohnologues had no role in tissue evolution. In fact, the proportions
of ohnologues were substantially reduced in fast evolving tissues. This effect and the increase in the ω
values cancelled each other out.
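
The decomposition used in this section can be checked numerically: each component is the mean ω of a gene type multiplied by that type's share of the expressed genes, and the three components add up to the tissue evolutionary rate. The sketch below is illustrative only; the function name and the toy ω values are hypothetical.

```python
from typing import Dict, List


def rate_components(omega_by_type: Dict[str, List[float]]) -> Dict[str, float]:
    """Per-type contribution: (mean omega of the type) x (share of genes of that type)."""
    total_genes = sum(len(values) for values in omega_by_type.values())
    components = {}
    for gene_type, values in omega_by_type.items():
        if not values:
            components[gene_type] = 0.0  # a type with no expressed genes contributes nothing
            continue
        mean_omega = sum(values) / len(values)
        share = len(values) / total_genes
        components[gene_type] = mean_omega * share
    return components


# Toy omega values for one tissue (hypothetical).
toy = {"ohnologue": [0.10, 0.14], "ssd": [0.25, 0.35, 0.30], "singleton": [0.28]}
parts = rate_components(toy)
pooled = [w for values in toy.values() for w in values]
pooled_mean = sum(pooled) / len(pooled)
```

Here `sum(parts.values())` equals `pooled_mean` up to floating-point rounding, which is the identity stated by the formula. A flat ohnologue component across tissues can therefore arise when a rising mean ω is offset by a falling share.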

3.7. Equal contribution of ohnologues, SSD genes, 
and singletons among genes with ubiquitous 
expression 

Figure 2E-H shows the results obtained under the condition n > 10 tissues. Virtually no
variation among tissues (flat lines) was seen for any of the parameters measured or for any gene
type. This is not surprising given the ubiquitous nature of the expression of each gene type in a
broad range of tissues. The average ω values when n > 10 tissues were remarkably lower, particularly
for SSD genes and singletons (0.15 and 0.17, respectively), than those observed
when n < 10 tissues (0.26 and 0.29, respectively; Fig. 2F and B and Table 1). In terms of the
proportion of gene types (Fig. 2G), the proportion of ohnologues (40%) was substantially higher than
that of SSD genes and singletons (30% in both cases). This result is in sharp contrast to that found
for expression breadths of n < 10 tissues, indicating that among those genes that are expressed
ubiquitously, there is a larger proportion of ohnologues and a lower proportion of
SSD genes and singletons. Finally, the relative contribution of ohnologues, SSD genes, and singletons
is nearly the same for all tissues (Fig. 2H), which is in marked contrast to the situation
where n < 10 tissues (Fig. 2D).

Overall, the results show that the relative propor- 
tion of ohnologues, SSD genes, and singletons, as 
well as their evolutionary rates, are substantially dif- 
ferent between genes with ubiquitous and restricted 
expression breadths. 

3.8. Proportions of ohnologues and SSD genes 
in various expression breadths 

To confirm the previous findings, we analysed cases involving expression breadths other than n < 10
and n > 10 tissues. Results similar to those shown in Fig. 2A-D were obtained for n = 1, n < 3,
n < 5, n < 20, and n < 40 tissues, whereas observations similar to those indicated in Fig. 2E-H
were obtained for n > 20 and n > 40 tissues (data not shown).

M. Satake et al., Tissue Evolution Driven by Gene Duplication, Vol. 19

Figure 2. Contribution of ohnologues, SSD genes, and singletons among breadth-restricted (n < 10 tissues) and breadth-ubiquitous (n > 10
tissues) genes. Blue, red, and green represent the ohnologues, SSD genes, and singletons, respectively. Tissues are aligned on the x-axis in the
order of magnitude of the Ka/Ks (ω) values of their expressed genes. Parameters used in the y-axis are as follows: (A and E) the average ω
values of all genes expressed in a given tissue, which corresponds to the defined tissue evolutionary rate; (B and F) the ω values of
ohnologues, SSD genes, and singletons; (C and G) the proportion of ohnologues, SSD genes, and singletons; and (D and H) the ω values
and the proportions calculated for ohnologues, SSD genes, and singletons. The order of the tissues from left to right is as follows.
In (A)-(D), the nervous system, subthalamic nucleus, amygdala, cerebellum, cartilage, substantia nigra, hypothalamus,
hippocampus, pericardium, corpus callosum, thalamus, T lymphocytes, ovary, eye, heart, caudate nucleus, adipose tissue, pancreas,
skin, muscle, adrenal gland, prostate, breast, lymph, kidney, colon, amnion, cervix, placenta, stomach, bladder, lung, uterus, bone,
germinal centre B cell, intestine, trachea, bone marrow, B cells, testis, liver, spleen, thymus, blood vessels, synovium, blood, and
nasopharynx. In (E)-(H), the amnion, cartilage, B cells, skin, muscle, nervous system, hypothalamus, cervix, adipose tissue, substantia
nigra, caudate nucleus, corpus callosum, T lymphocytes, subthalamic nucleus, heart, bone marrow, adrenal gland, lymph, blood vessels,
bone, bladder, amygdala, ovary, hippocampus, pancreas, breast, colon, blood, eye, pericardium, cerebellum, thalamus, prostate,
stomach, germinal centre B cell, kidney, liver, spleen, lung, uterus, testis, placenta, intestine, thymus, synovium, trachea, and nasopharynx.

Figure 3 shows the proportion of ohnologues (blue) and SSD genes (red) for each tissue under various
expression breadths (n = 1, n < 3, n < 5, n < 10, n < 20, n < 40, n > 10, n > 20, and n > 40).
The x-axis represents the tissue evolutionary rate, whereas the y-axis represents the proportions of
ohnologues and SSD genes. In tissue-restricted expression (n = 1, n < 3, n < 5, n < 20, and n < 40),
the faster the tissue evolutionary rate, the higher the proportion of SSD genes and the lower the
proportion of ohnologues expressed (correlation coefficients and P-values are shown in
Supplementary Table S2). When the expression breadth was <10, a statistically significant correlation
was detected for both the ohnologues and SSD genes
(P= 1.6 X 1 0"^^ and P= 2.1 x 1 0"^ respectively). A 
similar correlation was observed when the expression breadth was set at <20 and <40, although
the correlation was weaker in these cases. The correlation was therefore strongest and most
reliable at an expression breadth of <10.

On the other hand, in those cases where n > 20 
and n > 40 tissues, the proportion of ohnologues 
was higher and the proportion of SSD genes was 
lower but there was no significant correlation 
between tissue evolutionary rates and the proportion 
of either gene type. 
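
The kind of correlation reported in this section can be sketched as a linear (Pearson) correlation between per-tissue evolutionary rates and per-tissue gene-type proportions. This is an assumption-laden illustration: the text and Supplementary Table S2 refer to linear correlation coefficients, but the exact procedure is not given here, and the function name and toy values below are hypothetical.

```python
import math
from typing import List


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Linear correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy per-tissue values (hypothetical): tissue rate, SSD share, ohnologue share.
tissue_rates = [0.13, 0.17, 0.21, 0.25, 0.29]
ssd_share = [0.20, 0.25, 0.30, 0.35, 0.40]
ohnologue_share = [0.50, 0.45, 0.40, 0.35, 0.30]

r_ssd = pearson_r(tissue_rates, ssd_share)          # positive in this toy example
r_ohno = pearson_r(tissue_rates, ohnologue_share)   # negative in this toy example
```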

3.9. Orthologous genes between human 
and other vertebrates 

The previous analyses involved genes that are orthologous in humans and mice. However, the
origin of ohnologues can be traced back to the emergence of vertebrates. Therefore, human orthologues
in various vertebrates, including rat, cow, dog, opossum, and chicken, were extracted from Ensembl.
Using an expression breadth of n < 10, analyses similar to those shown in Fig. 2C were performed. In
the results shown in Supplementary Fig. S1, human genes were used as references for pairwise
comparisons. Similar to the results shown in Fig. 2C, the comparison between humans and other species
(Supplementary Table S3) shows that those tissues with higher ω values were associated with a larger
proportion of SSD genes than of ohnologues.

3.10. Recent SSD genes contribute to faster 
tissue evolution 

Approximate gene duplication times can be estimated based on Ks values, with lower and higher Ks
values corresponding to recent and ancient duplication events, respectively. To examine the relative
contribution of each class of SSD gene to tissue expression, a threshold value of Ks was set to 1.0 and
the number and the percentage of SSD genes in each class at various expression breadths were counted
(Fig. 4A-C). Narrower expression breadths were associated with a larger percentage of recent SSD
genes (Ks < 1.0, red), whereas the percentage of ancient SSD genes (Ks > 1.0, pink) was nearly
constant regardless of expression breadth. These results indicate a higher contribution of recent SSD
genes to tissue-restricted gene expression.

An additional analysis similar to that shown in Fig. 2C was performed using recent and ancient SSD
genes as criteria. The results are shown in Fig. 4D (expression breadth <10 tissues). Co-linearity
between ω values and gene proportions was 0.58 for recent (P = 1.7 × 10^-5) and 0.33 for ancient SSD
genes (P = 0.024), indicating a larger contribution of recent SSD genes to the positive co-linearity
among all SSD genes observed in Fig. 2C. Yet, the slightly positive co-linearity observed for ancient
SSD genes indicates that ancient SSD genes are distinct from ohnologues, which showed negative
co-linearity. These results indicate that recent SSD genes, ancient SSD genes, and ohnologues each
evolve in a distinct manner.



3.11. Association between gene evolutionary rates, 
duplication mode, and developmental origin 
of tissues 

In the analyses of the results shown in Fig. 2, fast and slow tissue evolutionary rates were based on ω
values but we did not address tissue types. To examine how ω values and the proportion of SSD
genes and ohnologues are associated with the developmental origin of tissues, ω values as well as the
proportions of ohnologues, SSD genes, and singletons were calculated for each tissue type. Values
calculated for the four subgroups with a breadth of n < 10 tissues are presented in Fig. 5
(Supplementary Table S4 shows P-values).

The average ω values were significantly higher in endodermal, intermediate in mesenchymal, and
lower in ectodermal tissues (Fig. 5A). Conversely, the proportion of ohnologues was higher in
ectodermal than in endodermal and mesenchymal tissues (Fig. 5B), whereas the proportion of SSD
genes was highest in endodermal, intermediate in mesenchymal, and lowest in ectodermal tissues
(Fig. 5C). The proportion of singletons did not vary among the three subgroups (Fig. 5D). Therefore, it
is possible that SSD genes with higher ω values might have contributed to the faster evolution of
endodermal tissues, whereas ohnologues with lower ω values might have functioned to maintain
the essential characteristics of ectodermal tissues. Mesenchymal tissues occupied an intermediate
position.
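
The grouping step can be sketched as averaging per-tissue values within each developmental-origin subgroup. The tissue-to-subgroup assignments below are a small excerpt of the classification given in Materials and Methods; the function name and the per-tissue rates are hypothetical values for illustration, not results from the study.

```python
from collections import defaultdict
from typing import Dict

# Excerpt of the tissue classification described in Section 2.4.
ORIGIN = {
    "liver": "endoderm", "lung": "endoderm",
    "skin": "ectoderm", "cerebellum": "ectoderm",
    "heart": "mesenchymal", "spleen": "haemocytes",
}


def mean_by_origin(values_by_tissue: Dict[str, float],
                   origin: Dict[str, str] = ORIGIN) -> Dict[str, float]:
    """Average a per-tissue value within each developmental-origin subgroup."""
    grouped = defaultdict(list)
    for tissue, value in values_by_tissue.items():
        if tissue in origin:  # unclassified tissues (e.g. ovary) are skipped
            grouped[origin[tissue]].append(value)
    return {group: sum(vals) / len(vals) for group, vals in grouped.items()}


# Hypothetical per-tissue evolutionary rates.
toy_means = mean_by_origin({
    "liver": 0.28, "lung": 0.24, "skin": 0.18,
    "cerebellum": 0.14, "heart": 0.20, "ovary": 0.17,
})
```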



3.12. The unique evolutionary position of haemocytes
Haemocyte genes displayed unique features that were distinct from those of the three subgroups of
tissues described previously. Specifically, the average ω value of haemocyte genes was the highest,
whereas the proportion of haemocyte ohnologues was the lowest. Furthermore, the proportion of
haemocyte singletons was higher than that of the other three subtypes of tissues.

Figure 3. Proportion of ohnologues and SSD genes at various gene expression breadths. The y-axis represents the proportions of ohnologues
(blue) and SSD genes (red), whereas the x-axis shows the tissue evolutionary rates [the average Ka/Ks (ω) value of all genes expressed in a
given tissue]. P-values and linear correlation coefficients are shown in Supplementary Table S2.



4. Discussion 

4.1. Gene evolutionary rate, duplication, 
and expression breadth 
In the present study, the putative association between gene evolutionary rates, duplication, and
expression breadth in different tissues was examined. Gene evolutionary rates are affected by various
factors. For example, duplicated genes are believed to evolve relatively fast, whereas singletons evolve
more slowly. Another important factor is the expression breadth of each gene. Genes expressed in a wide
range of tissues tend to evolve more slowly, whereas those restricted to a narrow range of tissues evolve
faster. In addition, as the number of genes in a gene family increases, the number of tissues in
which the corresponding gene is expressed decreases. However, these findings were based on
the analysis of only two parameters at a time, even though co-linearity among the three elements
considered (gene evolutionary rate, gene duplication, and gene expression breadth) is also possible.

Figure 4. Recent and ancient SSD genes. (A-C) The number and the percentage of ancient (pink) and recent (red) SSD genes are shown
for each tissue expression breadth. (D) The proportions of ancient and recent SSD genes among genes with tissue-restricted expression
limited to n < 10 tissues.

To analyse the involvement of duplicated genes in tissue evolution, we initially focused on the
relationship between gene expression breadth and gene duplication. However, the percentages of
singletons (Fig. 1D, green) and duplicated genes (Fig. 1D; red + blue) remained nearly constant at
30-35 and 65-70%, respectively, and were independent of gene expression breadth. This result suggests
that the association between gene expression breadth and gene duplication is not as simple as
reported previously.

The present study considered gene evolutionary rates and, in addition, distinguished between SSD
genes and ohnologues when considering gene duplication. Gene evolutionary rates were fast for SSD
genes and slow for ohnologues (Table 1). Interestingly, SSD genes were represented in a larger
proportion within narrow expression ranges, whereas the opposite was observed for ohnologues
(Fig. 1D). The present study is therefore the first to show co-linearity among gene
evolutionary rate, duplication, and expression breadth.

Figure 5. Contribution of ohnologues, SSD genes, and singletons to gene sets expressed in haemocytes, tissues of endodermal and
ectodermal origin, and mesenchymal tissues. (A) The average Ka/Ks (ω) values associated with each gene set. (B)-(D) The
proportion of ohnologues, SSD genes, and singletons expressed in each tissue type. In each panel, averages and standard deviations
are shown. P-values for pairwise comparisons are shown in Supplementary Table S4.



4.2. Contribution of SSD genes, ohnologues, and 
singletons to expression breadths in tissues 
Genes that are expressed in a narrow range of tissues are considered to play significant roles in the
establishment of the characteristic features of the corresponding tissues. Prior studies have shown that
genes expressed specifically in neural tissues evolve more slowly, whereas those expressed
in other tissue types evolve faster. In addition, a substantial proportion of genes with
tissue-specific expression tend to encode secreted polypeptides irrespective of tissue type, and there
are few such genes in neural tissues.

In this study, we also examined whether the nature of SSD genes depended on the type of tissue in
which the corresponding genes were expressed. To this end, the evolutionary rate of a tissue was
defined as the average ω value of those genes expressed in that tissue, whereas the nature of the
expressed genes was evaluated by their ω values and proportions as well as by the product of these
parameters. These parameters were calculated for each type of gene (SSD genes, ohnologues, and
singletons) and plotted for each tissue (Fig. 2). Among genes with a narrow spectrum of expression,
SSD genes (and singletons) played a minor role in slowly evolving tissues with low ω values.
However, the evolutionary rates as well as the proportion of SSD genes were associated positively with
the evolutionary rates of tissues. Thus, SSD genes (and singletons) seemed to play major roles in fast
evolving tissues with high ω values, indicating that the relative contribution of SSD genes differs
substantially depending on the type of tissue in which the gene is expressed. In contrast, genes with a
wide spectrum of expression are presumed to maintain basic cellular functions irrespective of the
tissue type. Among these genes, the relative contribution of SSD genes and ohnologues was similar.

Our results indicate that ohnologues with slower rates of evolution and wide expression ranges are
associated with slowly evolving tissues, whereas SSD genes with faster rates of evolution and narrow
ranges of expression are associated with fast evolving tissues. Specifically, there were fewer
ohnologues and more SSD genes in endodermal tissues such as the digestive tract. These tissues face
the outer environment directly and may therefore differentiate functionally and pleiotropically to
adjust to environmental changes. Rapid tissue evolutionary rates could therefore be related to a
specific mode of duplication (SSD genes) and faster molecular evolution (high ω values) in response
to this exposure to the outer environment.

Conversely, in ectoderm-derived tissues, such as the nervous system, there were more ohnologues and
fewer SSD genes. Because these tissues are not exposed to the outer environment, their evolution may
not be driven by functional differentiation but rather by the need to maintain basal functions.
Ohnologues with a slow evolutionary rate and lower probability of further duplication were therefore
predominant in slowly evolving tissues such as the ectoderm. Mesenchymal tissues were associated
with an intermediate phenotype in terms of ω values and the proportion of ohnologues and SSD genes.

Genes with expression restricted to haemocytes exhibited unique features. The percentage
of ohnologues was the smallest among all tissue types, whereas singletons were the most highly
represented, a result not observed for the other three tissue types. A lower proportion of
conservative ohnologues might relax the constraints required for the formation and functionality of
tissues. This flexibility might be further enhanced by an increase in the ratio of singletons. Such
findings might be expected in light of the dramatic transitions of haematopoietic organs and tissues
from the aorta, gonads, and mesonephros regions to the liver, spleen, and bone marrow during
mammalian development and vertebrate evolution. Furthermore, the haemocytes used in this study
included immune-competent cells and tissues, and immune-related genes are known to evolve fast
(immunoglobulin and T cell receptor gene families were excluded from the analysis here). In this
sense, endodermal tissues and haemocytes appear to adopt distinct strategies, increasing the
proportion of SSD genes and singletons, respectively, when adapting to environmental changes.